% $Id$

% Copyright (c) 2007-2015, Trustees of The Leland Stanford Junior University
% All rights reserved.
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions are met:
%
% Redistributions of source code must retain the above copyright notice, this
% list of conditions and the following disclaimer.
% Redistributions in binary form must reproduce the above copyright notice,
% this list of conditions and the following disclaimer in the documentation
% and/or other materials provided with the distribution.
%
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
% AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
% LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
% CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
% SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
% INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
% CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
% POSSIBILITY OF SUCH DAMAGE.

\documentclass[11pt]{article}
\usepackage{fancyhdr}
\usepackage[dvips]{graphicx} 
\usepackage{amsmath,amssymb} 
\usepackage{epsfig}
\usepackage{calc}

\newcommand{\simname}{BookSim~}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Setup the margin sizes.

\evensidemargin = 0in
\oddsidemargin = 0in
\textwidth = 6.5in

\topmargin = -0.5in
\textheight = 9in

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\author{Nan Jiang, George Michelogiannakis, Daniel Becker, Brian Towles and William J. Dally}
\title{\simname 2.0 User's Guide}

\begin{document}

\maketitle
\tableofcontents

\pagestyle{fancy}
%\renewcommand{\chaptermark}[1]{\markboth{#1}{}}
\renewcommand{\sectionmark}[1]{\markright{\thesection\ #1}}
\fancyhf{} % delete current setting for header and footer
\fancyhead[LE,RO]{\bfseries\thepage}
\fancyhead[LO]{\bfseries\rightmark}
\fancyhead[RE]{\bfseries\leftmark}
\renewcommand{\headrulewidth}{0.5pt}
\renewcommand{\footrulewidth}{0.5pt}
\addtolength{\headheight}{0.5pt} % make space for the rule
\cfoot{\small\today}
\fancypagestyle{plain}{%
    \fancyhf{} % get rid of headers on plain pages
    \renewcommand{\headrulewidth}{0pt} % and the line
    \renewcommand{\footrulewidth}{0pt} % and the line
} 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newenvironment{opt_list}[1]{\begin{list}{}{\renewcommand{\makelabel}[1]%
{\texttt{##1}\hfil}\settowidth{\labelwidth}{\texttt{#1}}\setlength{\leftmargin}%
{\labelwidth+\labelsep}}}{\end{list}}

\section{Introduction}

This document describes the use of the \simname interconnection
network simulator.  The simulator is designed as a companion to the
textbook ``Principles and Practices of Interconnection Networks''
(PPIN) published by Morgan Kaufmann (ISBN: 0122007514) and it is
assumed that is reader is familiar with the material covered in that
text.

This user guide is fairly brief as, with most simulators, the best way
to learn and {\it understand} the simulator is to study the code.
Most of the simulator's components are designed to be modular so tasks
such as adding a new routing algorithm, topology, or router
microarchitecture should not require a complete redesign of the code.
Once you have downloaded the code, compiled it, and run a simple
example (Section~\ref{sec:get_started}), the more detailed examples of
Section~\ref{sec:examples} give a good overview of the capabilities of
the simulator.  A list of configuration options is provided in
Section~\ref{sec:config_params} for reference.

\section{Getting started}
\label{sec:get_started}

\subsection{Downloading and building the simulator}
\label{sec:download}

The latest official release of the simulator can be checked out from our subversion (SVN) repository. Make sure a subversion client is installed on your machine; under UNIX, you can use the following command to check out a working copy:

\begin{verbatim}
svn co https://nocs.stanford.edu/cgi-bin/svn.cgi/booksim2.0
\end{verbatim}

The simulator files should now be in the \texttt{booksim2.0/trunk/src/} directory.
The simulator itself is written in C++ and has been specifically tested with the GNU G++ compiler (version $\ge3$).
The front end of the simulator uses LEX and YACC generated parser to process the simulator configuration file; however, unless you plan on making changes to the front end parser, LEX and YACC are not needed.
The \texttt{Makefile} should be edited so that the first few lines reflect the correct paths to the tools for your particular system.
The default \texttt{Makefile} should work on the Stanford Leland machines.
Type \texttt{make} to build the simulator. 

A note for Windows users:
The above instructions have been tested to work with Cygwin 1.7.18.

\subsection{Running a simulation}
\label{sec:run_example}

The simulator is invoked using the following command line:
\begin{verbatim}
  ./booksim [configfile]
\end{verbatim}

The parameter \texttt{configfile} is a file that contains configuration information for the simulator.
So, for example, to simulate the performance of a simple $8 \times 8$ torus (8-ary 2-cube) network on uniform traffic, a configuration such as the one shown in Figure~\ref{fig:config_example} could be used.
This particular example configuration file can be found in the \texttt{examples/torus88} directory.

\begin{figure}
\begin{verbatim}
  // Topology
  topology = torus;
  k        = 8;
  n        = 2;

  // Routing
  routing_function = dim_order;

  // Flow control
  num_vcs = 4;

  // Traffic
  traffic        = uniform;
  injection_rate = 0.15;
\end{verbatim}
\caption{Example configuration file for simulating a 8-ary 2-cube
network.}
\label{fig:config_example}
\end{figure}

In addition to specifying the topology, the configuration file also contains basic information about the routing algorithm, flow control, and traffic.
This simple example uses dimension-order routing and four virtual channels.
The \texttt{injection\_rate} parameter is added to tell the simulator to inject (on average) 0.15 packets per simulation cycle per node.
Packet size defaults to a single flit.
Also, any line of the configuration file that begins with \texttt{//} is treated as a comment and ignored by the simulator.
A detailed list of configuration parameters is given in Section~\ref{sec:config_params}.
Any parameters not specified by the user will take on default values.
The default value for every parameter in the simulator is specified in the file \texttt{booksim\_config.cpp}.

\subsection{Simulation output}

Continuing our example, running the torus simulation produces the
output shown in Figure~\ref{fig:sim_output}.  Each simulation has
three basic phases: warm up, measurement, and drain.  The length of
the warm up and measurement phases is a multiple of a basic sample
period (defined by \texttt{sample\_period} in the configuration).  As
shown in the figure, the current latency and throughput (rate of
accepted packets) for the simulation is printed after each sample
period.  The overall throughput is determined by the lowest throughput
of all the destination in the network, but the average throughput is
also displayed.

\begin{figure}
\begin{verbatim}
BEGIN Configuration File: examples/torus88

....

// Topology
topology = torus;
k = 8;
n = 2;
// Routing
routing_function = dim_order;
// Flow control
num_vcs = 2;
// Traffic
traffic = uniform;
injection_rate = 0.15;

END Configuration File: examples/torus88
Class 0:
Packet latency average = 33.2807
	minimum = 7
	maximum = 73

....

Warmed up ...Time used is 2000 cycles

....

Draining all recorded packets ...
Draining remaining packets ...
Time taken is 5104 cycles
====== Overall Traffic Statistics ======
====== Traffic class 0 ======
Packet latency average = 33.6964 (1 samples)
	minimum = 7 (1 samples)
	maximum = 84 (1 samples)

....

Total run time 1.01837
\end{verbatim}
\caption{Simulator output from running the \texttt{examples/torus88}
configuration file.}
\label{fig:sim_output}
\end{figure}

After the warm up periods have passed (default to 3$\times$sample\_period), the simulator prints the
``\texttt{Warmed up}'' message and resets all the simulation statistics.
Then, the measurement phase begins and statistics continue to be
reported after each sample period.  The measurement phase 
typically last for last least 3 sample periods. 
Once the measurement periods have
passed, all the measurement packets are drained from the network
before final latency and throughput numbers are reported.  Details of
the configuration parameters used to control the length of the
simulation phases are covered in Section~\ref{sec:sim_params}.

\section{Example}
\label{sec:examples}

One of the most basic performance measures of any interconnection
network is its latency versus offered load.
Figure~\ref{fig:lat_vs_load} shows a simple configuration file for
making this measurement in a 8-ary 2-mesh network under the transpose
traffic pattern.  This configuration was used to generate a single data point in Figure 25.2
in PPIN.  The particular configuration accounts for some small delays
and pipelining of the input-queued router and also introduces a small
input speedup to account for any inefficiencies in allocation.  By
running simulations for many increments of \texttt{injection\_rate},
the average latency curve can be found.  Then, to compare the
performance of dimension-order routing against several other routing
algorithms, for example, the \texttt{routing\_function} option can be
changed.

\begin{figure}
\begin{verbatim}
// Topology

topology = mesh;
k = 8;
n = 2;

// Routing
routing_function = dor;

// Flow control
num_vcs     = 8;
vc_buf_size = 8;
wait_for_tail_credit = 1;

// Router architecture
vc_allocator = islip;
sw_allocator = islip;
alloc_iters  = 1;

credit_delay   = 2;
routing_delay  = 0;
vc_alloc_delay = 1;
sw_alloc_delay = 1;

input_speedup     = 2;
output_speedup    = 1;
internal_speedup  = 1.0;


// Traffic
traffic = transpose;
packet_size = 20;


// Simulation
sim_type = latency;

injection_rate = 0.005;

\end{verbatim}
\caption{A typical configuration file (\texttt{examples/mesh88\_lat})
for creating a latency versus offered load curve for a 8-ary 2-mesh
network.}
\label{fig:lat_vs_load}
\end{figure}

\section{Configuration parameters}
\label{sec:config_params}

All information used to configure a simulation is passed through a
configuration file as illustrated by the example in 
Section~\ref{sec:run_example}.  This section lists the major
configuration parameters for the simulator. Additional description can be found in the 
configuration parameter class file \texttt{booksim\_config.cpp}.
A user can incorporate additional options by changing the this file.

\subsection{Topologies}
\label{sec:topos}

The \texttt{topology} parameter determines the underlying topology of
the network. There is also a set of numerical parameters that
describes the size of the networks.  

\begin{opt_list}{topoparams}

\item[k] Network radix, the number of routers per dimension
\item[n] Network dimension
\item[c] Network concentration, the number of nodes sharing a single
  router. Typically set to 1, >1 only in networks that has
  concentration (i.e. cmesh).
\item[x] (NoC simulations only) The number of routers in the X
  dimension. Used to calculate channel latency between routers.  
\item[y] (NoC simulations only) The number of routers in the y
  dimension. Used to calculate channel latency between routers.  
\item[xr] (NoC simulations only) For networks that have c>1, the
  number of nodes in the x direction per router. Used to calculate
  channel latency between routers.  
\item[yr] (NoC simulations only) For networks that have c>1, the
  number of nodes in the y direction per router. Used to calculate
  channel latency between routers.  

\end{opt_list}


The channel latency of the network must be configured within the
source code of the topology files. All topologies by default have
channel latency of 1 cycles. Topologies available in \simname are: 

\begin{opt_list}{topologies}
\item[fly] A $k$-ary $n$-fly (butterfly) topology. The \texttt{k}
parameter determines the network's radix and the \texttt{n} parameter
determines the network's dimension. Note: a $k$-ary 1-fly is
essentially a single radix-k router, useful for testing.

\item[mesh] A $k$-ary $n$-mesh (mesh) topology. The \texttt{k}
parameter determines the network's radix and the \texttt{n} parameter determines
the network's dimension.


\item[torus] A $k$-ary $n$-cube (torus) topology.  The \texttt{k}
parameter determines the network's radix and the \texttt{n} parameter determines
the network's dimension.
\item[cmesh] Concentrated mesh topology is a A $k$-ary $n$-mesh
  topology with multiple nodes sharing a single router. The \texttt{c}
  determines the concentration. The cmesh topology has the option that
  turns on "express channels" as described in  by default these
  channels are turned off.  

\item[fat tree] Fat Tree topology with 3 levels. Nodes are routers are
  arranged in a tree structure but the number of links between levels
  stays constant. At the bottom \texttt{k} nodes shares a level 0
  router. 

\item[flattened butterfly] A topology based on the paper "Flattened
  butterfly: a cost-efficient topology for high-radix networks" ISCA
  2007 

\item[dragonfly] A topology based on the paper ``Technology-driven,
  highly-scalable dragonfly topology.'' ISCA 2008

\item[quad tree] A quad tree topology.

\item[tree 4]

\item[anynet] A topology based on an user input file specifying
  connectivity of nodes and routers. 

\end{opt_list}

%Both the \texttt{mesh} and \texttt{torus} topologies support the
%addition of random link failures with the \texttt{link\_failures}
%parameter.  The value of \texttt{link\_failures} determines the number
%of channels that are randomly removed from the topology and are thus
%no longer available for forwarding packets.  Moreover, the
%randomization for failed channels is controlled by selecting an
%integer value for the \texttt{fail\_seed} parameter --- a fixed seed
%gives a fixed set of failed channels, independent of other
%randomization in the simulation.  Also, note that only certain routing
%functions support this feature (see Section~\ref{sec:routing_algs}).

\subsection{Physical sub-networks}
\label{sec:physical_subnets}

The \texttt{physical\_subnetworks} parameter defines the number of
physical sub-networks present in the network (defaults to one). All
sub-networks receive the same configuration parameters and thus are
identical. Traffic sources maintain an injection queue for each
sub-network. The packet 
generation process is unaffected. It enqueues generated packets into
the proper sub-network queue according to a division function in the
traffic manager. At every cycle, flits at the head of each queue
attempt to be injected. Traffic destinations can eject one flit from
each sub-network each cycle. 


\subsection{Routing algorithms}
\label{sec:routing_algs}

The \texttt{routing\_function} parameter selects a routing algorithm
for the topology.  Many routing algorithms need multiple virtual
channels for deadlock freedom. In addition to \texttt{routefunc.cpp},
some topologies source files include additional routing
functions. Also, the simulator code is structured so that additional
routing algorithms can be added with minimal changes to the overall
simulator (see the \texttt{routefunc.cpp} file in the simulator's
source code). 

\subsection{Flow control}

The simulator supports basic virtual-channel flow control with
credit-based backpressure.  

\begin{opt_list}{flowcontrolparams}

\item[num\_vcs] The number of virtual channels per physical channel.

\item[vc\_buf\_size] The depth of each virtual channel in flits.


%voq is not currently enabled in the simulator !
%\item[voq] If non-zero, use virtual-output queuing.  With virtual
%output queuing, a separate virtual channel is assigned to each
%destination in the network.  This option is most useful when used with
%a non-interfering routing algorithm (Section~\ref{sec:routing_algs}).
  
\item[wait\_for\_tail\_credit] If non-zero, do not reallocate a virtual
channel until the tail flit has left that virtual channel.  This
conservative approach prevents a dependency from being formed between
two packets sharing the same virtual channel in succession.
\end{opt_list}

\subsection{Router organizations}

The simulator also supports two different router microarchitectures.
The input-queued router follows the general organization described in
PPIN while the event-driven router is modeled after the router used in
the Avici TSR and described in U.S. Patent 6,370,145.  The
microarchitecture is selected using the \texttt{router} option.  Also,
both routers share a small set of options.

\begin{opt_list}{routerparams}
\item[credit\_delay] The processing delay (in cycles) for a credit.
Does not include the wire delay for transmitting the credit.

\item[internal\_speedup] An arbitrary speedup of the internals of the
routers over the channel transmission rate.  For example, a speedup
1.5 means that, on average, 1.5 flits can be forwarded by the router
in the time required for a single flit to be transmitted across a
channel.  Also, the configuration parser expects a floating point
number for this field, so integer speedups should also include a
decimal point (e.g. ``2.0'').

%\item[output\_delay] The processing delay incurred in the output queue
%of a router.
\end{opt_list}

\subsubsection{The input-queued router}
\label{sec:iq_router}

The input-queued router (\texttt{router = iq}) follows the pipeline
described in PPIN of route computation, virtual-channel allocation,
switch allocation, and switch traversal.  There are several options
specific to the input-queued router.

\begin{opt_list}{iqrouterparams}

\item[input\_speedup] An integer speedup of the input ports in space.
A speedup of 2, for example, gives each input two input ports into the
crossbar.  Access to these ports is statically allocated based on the
virtual channel number: virtual channel $v$ at input $i$ is connected
to port $i \cdot s + (v \mod s)$ for an input speedup of $s$.

\item[output\_speedup] An integer speedup of the output ports in
space.  Similar to \texttt{input\_speedup}

\item[routing\_delay] The delay (in cycles) of route computation.

\item[hold\_switch\_for\_packet]

\item[speculative] Enable speculative switch allocation (i.e., allow
  switch allocation to occur in parallel with VC allocation for header
  flits). 

%\item[filter\_spec\_grants] Determines how speculative grants are masked (\texttt{any\_nonspec\_gnts}: any non-speculative grant inhibits all speculative grants; \texttt{confl\_nonspec\_reqs}: speculative grants are inhibited by conflicting non-speculative requests; \texttt{confl\_nonspec\_gnts}: speculative grants are inhibited by conflicting non-speculative grants).

\item[alloc\_iters] For the \texttt{islip}, \texttt{pim} and
  \texttt{select} allocators, allocation can be improved by performing
  multiple iterations of the algorithm; this parameter controls the
  number of iterations to be performed for both switch and VC
  allocation. 

\item[arb\_type] If the VC or switch  allocator is a separable
  input- or output-first allocator, this parameter selects the type of
  arbiter to use. 

\item[sw\_allocator] The type of allocator used for switch
  allocation. See Section~\ref{sec:alloc} for a list of the possible
  allocators. 

\item[sw\_alloc\_delay] The delay (in cycles) of switch allocation.

\item[vc\_allocator] The type of allocator used for virtual-channel
allocation.  See Section~\ref{sec:alloc} for a list of the possible
allocators.

\item[vc\_alloc\_delay] The delay (in cycles) of virtual-channel
allocation.

\end{opt_list}

\subsubsection{The event-driven router}
\label{sec:event_router}

The event-driven router (\texttt{router = event}) is a
microarchitecture designed specifically to support a large number of
virtual channels (VCs) efficiently.  Instead of continuously polling
the state of the virtual channels, as in the input-queued router, only
changes in VC state are tracked.  The efficiency then comes from the
fact that the number of state changes per cycle is constant and
independent of the number of VCs.

\subsection{Allocators}
\label{sec:alloc}

Many of the allocators used in the simulator are configurable (see
the input-queued router in Section~\ref{sec:iq_router}) and several
allocation algorithms are available.

\begin{opt_list}{allocators}

\item[max\_size] Maximum-size matching. 
\item[islip] iSLIP separable allocator.
\item[pim] Parallel iterative matching separable allocator.
\item[loa] Lonely output allocator.
\item[wavefront] Wavefront allocator.
\item[separable\_input\_first] Separable input-first allocator.
\item[separable\_output\_first] Separable output-first allocator.
\item[select] Priority-based allocator.  Allocation is performed as in
iSLIP, but with preference towards higher priority packets.
% (see \texttt{priority} option in Section~\ref{sec:traffic}).

\end{opt_list}

\subsection{Traffic}
\label{sec:traffic}


\subsubsection{Injection mode}
The rate at which packets are injected into the simulator is set using
the \texttt{injection\_rate} option.  The simulator's cycle time is a
flit cycle, the time it takes a single flit to be injected at a
source, and the injection rate is specified in packets per flit cycle.
For example, setting \texttt{injection\_rate = 0.25} means that each
source injects a new packet in one out of every four simulator cycles.
The unit of \texttt{injection\_rate} can optionally be changed to flits per cycle by 
setting \texttt{injection\_rate\_uses\_flits} to 1.

The injection process can further be specified as either Bernoulli process 
(\texttt{injection\_process = bernoulli}) or an on-off process
(\texttt{injection\_process = on\_off}).  The burstiness of the latter
is controlled via the \texttt{burst\_alpha} and
\texttt{burst\_beta} parameters.  See PPIN Section 24.2.2 for a
description of the on-off process and its parameters.

\subsubsection{Request-reply traffic}
By default, all packets that are injected into the network have the same, 
fixed length. The number of flits per packet is set using the
\texttt{const\_flits\_per\_packet} option. 

Alternatively, the simulator can be configured to generate request-reply 
traffic. In this mode, the traffic manager injects read and write requests 
into the network; when a destination node receives such a request 
packet, it returns a reply packet of the same type. Injection of reply 
packets has priority over injection of new requests packets. 
Consequently, the effective overall network load is generated by both 
the injected requests and the automatically generated replies. 
Request-reply traffic ignores the \texttt{const\_flits\_per\_packet} 
option; instead, packet sizes are determined by the 
\texttt{\{read|write\}\_\{request|reply\}\_size} options. Furthermore, 
the mapping of packet types to VCs can be customized using the 
\texttt{\{read|write\}\_\{request|reply\}\_\{begin|end\}\_vc} options.

\subsubsection{Traffic patterns}
The simulator also supports several different traffic patterns that
are specified using the \texttt{traffic} option.  To describe these
patterns, we use the same notation of PPIN Section 3.2: $s_i$ ($d_i$)
denotes the $i^\textrm{th}$ bit of the source (destination) address
whereas $s_x$ ($d_x$) denotes the $x^\textrm{th}$ radix-$k$ digit of
the source (destination) address.  The bit length of an address is $b
= \log_2 N$, where $N$ is the number of nodes in the network.

\begin{opt_list}{trafficpatterns}
\item[uniform] Each source sends an equal amount of traffic to each
destination (\texttt{traffic = uniform}).
\item[bitcomp] Bit complement. $d_i = \neg s_i$.
\item[bitrev] Bit reverse. $d_i = s_{b-i-1}$.
\item[shuffle] $d_i = s_{i-1 \mod b}$.
\item[transpose] $d_i = s_{i+b/2 \mod b}$.
\item[tornado] $d_x = s_x + \lceil k/2 \rceil - 1 \mod k$.
\item[neighbor] $d_x = s_x + 1 \mod k$.
\item[randperm] Random permutation.  A fixed permutation traffic
pattern is chosen uniformly at random from the set of all
permutations.  The seed used to generate this permutation is set by
the \texttt{perm\_seed} option.  So, randomly selecting values for
\texttt{perm\_seed} gives a random sampling of permutations while a
fixed value of \texttt{perm\_seed} allows the same permutation to be
used for several experiments.
\end{opt_list}

\subsection{Simulation parameters}
\label{sec:sim_params}

The duration and other aspects of a simulation are controlled using
the set of simulation parameters.

\begin{opt_list}{simparams}

\item[sim\_type] A simulation can either focus on
\texttt{throughput} or \texttt{latency}.  The key difference between
these two types is that a \texttt{latency} simulation will wait for
all measurement packets to drain before ending the simulation to
ensure an accurate latency measurement.  In \texttt{throughput}
simulations, this final drain step is eliminated to allow simulation
of networks operating beyond their saturation point.

\item[sample\_period] The sample period is expressed in simulator
cycles and is used as a multiplier when specifying the warm-up length
of a simulation and the maximum number of samples.  Also, intermediate
statistics are displayed once every \texttt{sample\_period} cycles. This is only applicable in injection mode.

\item[warmup\_periods] The length of the simulator warm up expressed
as a multiple of the \texttt{sample\_period}.  After warming up, all
statistics counters are reset. This is only applicable in injection mode.

\item[max\_samples] The total length of simulation expressed as a
multiple of the \texttt{sample\_period}. This is only applicable in injection mode.

\item[latency\_thres] If the sampled latency of the current simulation
exceeds \texttt{latency\_thres}, the simulation is immediately ended.

\item[sim\_count] The number of back-to-back simulations to run for the
given configuration.  Useful for creating ensemble averages of
particular statistics.

\item[seed] A random seed for the simulation.

%This is currently not setup in the traffic manager.
%\item[reorder] A non-zero value indicates that packet order should be
%maintained and reordering time is accounted for in the overall latency.

\item[print\_activity] At the end of a simulation using iq\_router, print out the activity for buffer, switch, and channel of the network. 

%\item[viewer\_trace] The simulator will generate very verbose print out of all activity inside the network. This print out should be fed into noc\_viewer for a graphic display of the activity inside the network. Currently not working. 

\item[watch\_file] Specific flits can have their "watch" status turn on. Require input a file which has flit id listed. 1 id per line. 

\end{opt_list}


\appendix
\section{Random number generation}

The simulator uses Knuth's integer and floating point pseudorandom
number generators.  These algorithms and their explanations appear in
``The Art of Computer Programming: Seminumerical Algorithms''.

\end{document}
